netstack: add ICMP echo forwarder reply API #13413

Open
Amaindex wants to merge 1 commit into google:master from Amaindex:netstack-icmp-echo-forwarder-reply

@Amaindex

Contributor

Overview

This is the next ICMP Echo follow-up after #13189 and #13281.

It adds an icmp.Forwarder adapter for ICMP packets delivered through stack.SetTransportProtocolHandler, and lets the handler explicitly delegate ICMP Echo Reply generation back to netstack with ForwarderRequest.Reply().

The goal is to let embedders handle ICMP Echo Requests without reimplementing the built-in reply construction logic downstream.

Behavior

The default behavior remains conservative:

no ICMP default handler                   -> built-in Echo Reply is preserved
default handler returns false, no Reply() -> built-in Echo Reply is preserved
default handler returns true, no Reply()  -> no built-in Echo Reply
registered ICMP endpoint                  -> default handler is not called; built-in Echo Reply is preserved
IPv4 LocalAddressTemporary                -> built-in Echo Reply is still suppressed unless the handler explicitly calls Reply()

The new explicit delegation path is:

f := icmp.NewForwarder(func(r *icmp.ForwarderRequest) bool {
    if shouldHandleByTunnel(r) {
        return handleByTunnel(r)
    }
    return r.Reply()
})

s.SetTransportProtocolHandler(icmp.ProtocolNumber4, f.HandlePacket)
s.SetTransportProtocolHandler(icmp.ProtocolNumber6, f.HandlePacket)

Reply() is idempotent. If a handler calls Reply() and then returns false, netstack does not synthesize a second built-in Echo Reply on the fallback path. A successful Reply() return means the built-in reply path was available and accepted for this request; it is not a packet-delivery guarantee.

API Semantics

This API keeps the existing SetTransportProtocolHandler ownership contract:

handler returns true   -> the handler consumed the packet
handler returns false  -> the handler did not consume the packet; protocol fallback may continue

ForwarderRequest.Reply() is a separate explicit action. It asks netstack to synthesize the built-in Echo Reply for this request, without changing what the handler's return value means.

That gives handlers two independent decisions:

return true without Reply()   -> consume/drop/forward the Echo Request; no built-in reply
return false without Reply()  -> do not consume it; preserve the current fallback behavior
Reply(); return true          -> explicitly send the built-in reply and consume the request
Reply(); return false         -> explicitly send the built-in reply, then allow fallback to continue without sending a duplicate reply

I kept the final API as bool + Reply() because it fits the existing default-handler contract and keeps the user-facing surface small. An action enum could model this, but it would need to combine two distinct decisions: whether the handler consumed the packet, and whether it asked netstack to send the built-in Echo Reply. Keeping Reply() as an explicit request method avoids redefining SetTransportProtocolHandler semantics for ICMP.
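The four combinations above can be checked against a small stand-alone model. This is only a sketch of the semantics described in this PR, not gVisor code: the `decision` type and `builtInReplies` function are hypothetical names, and the model assumes the packet is an Echo Request, that no ICMP endpoint is registered for it, and that Reply() succeeded when it was called.

```go
package main

import "fmt"

// decision is a hypothetical stand-in for what an ICMP default handler did
// with one Echo Request. It is not a gVisor type.
type decision struct {
	consumed    bool // the handler's return value
	replyCalled bool // whether the handler called Reply() at least once
}

// builtInReplies returns how many built-in Echo Replies this model expects
// for one Echo Request. tempAddr models IPv4 LocalAddressTemporary.
func builtInReplies(d decision, tempAddr bool) int {
	if d.replyCalled {
		// Explicit delegation: exactly one reply, even for a temporary
		// address, and no duplicate on the fallback path.
		return 1
	}
	if d.consumed {
		// The handler owns the request and suppressed the built-in reply.
		return 0
	}
	if tempAddr {
		// The fallback keeps the conservative temporary-address default.
		return 0
	}
	// Not consumed and no Reply(): the existing fallback answers.
	return 1
}

func main() {
	cases := []struct {
		name string
		d    decision
	}{
		{"return true without Reply()", decision{consumed: true}},
		{"return false without Reply()", decision{}},
		{"Reply(); return true", decision{consumed: true, replyCalled: true}},
		{"Reply(); return false", decision{replyCalled: true}},
	}
	for _, c := range cases {
		fmt.Printf("%-30s normal=%d temporary=%d\n",
			c.name, builtInReplies(c.d, false), builtInReplies(c.d, true))
	}
}
```

Keeping `consumed` and `replyCalled` as separate fields mirrors the design choice described above: an enum would have to enumerate their product, while two booleans leave the handler's return value meaning only "consumed".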

LocalAddressTemporary

This change keeps the no-handler IPv4 LocalAddressTemporary behavior from #11609 and #13189: temporary local addresses do not automatically trigger a built-in Echo Reply.

The new capability is explicit delegation. If the ICMP default handler sees r.PacketInfo().LocalAddressTemporary and decides that netstack should still generate the Echo Reply, it can call r.Reply(). In that path, the reply uses the original Echo Request destination address as the reply source, and the reply path can find a route without requiring that temporary address to be registered as the route local address.

That addresses the large-address-space embedder case discussed in #13189 without changing the conservative default for tunnel/proxy handlers that intentionally consume Echo Requests.

This does not add an IPv6 Temporary()-based Echo Reply suppression rule.

Implementation Notes

The user-facing adapter is intentionally small:

  • icmp.NewForwarder(handler) constructs a handler adapter.
  • Forwarder.HandlePacket can be passed to stack.SetTransportProtocolHandler.
  • ForwarderRequest.ID() exposes the parsed ICMP identifier through stack.TransportEndpointID.
  • ForwarderRequest.PacketBuffer() exposes the packet for the duration of the handler call.
  • ForwarderRequest.PacketInfo() exposes network-layer metadata such as LocalAddressTemporary.
  • ForwarderRequest.Reply() delegates to the built-in Echo Reply path.

The actual Echo Reply construction remains in the IPv4/IPv6 network endpoint layer. transport/icmp does not duplicate route lookup, source address selection, rate limiting, stats, checksum handling, IPv4 options, IPv6 traffic class, or output hooks.

The per-packet reply hook is not the intended embedder-facing API. Embedders should use ForwarderRequest.Reply(); the exported PacketBuffer hook helpers are a cross-package bridge that lets the network endpoint path expose its built-in reply operation to the ICMP forwarder without moving protocol-specific reply construction into transport/icmp.

The hook is deliberately scoped to packet delivery:

  • PacketBuffer stores a private transient hook object.
  • NetworkPacketInfo remains metadata-only.
  • ForwarderRequest is invalidated after the handler returns.
  • PacketBuffer.Clone() does not copy the reply hook.
  • The IPv4/IPv6 ICMP paths clear the hook after transport delivery.

This keeps the hook from escaping past the temporary reply buffers prepared by the network endpoint.
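The lifetime rules above can be illustrated with a toy model. Everything in this sketch is hypothetical (`packet`, `request`, `deliver`, and the `replyHook` field are illustration names, not the types added by this PR), and it assumes a repeated Reply() call reports the earlier success rather than sending again.

```go
package main

import "fmt"

// packet is a hypothetical stand-in for a packet buffer that carries a
// transient, per-delivery reply hook. It is not stack.PacketBuffer.
type packet struct {
	data      []byte
	replyHook func() bool
}

// clone copies the payload but deliberately drops the reply hook.
func (p *packet) clone() *packet {
	return &packet{data: append([]byte(nil), p.data...)}
}

// request is a hypothetical stand-in for the handler-facing request object.
type request struct {
	pkt     *packet
	valid   bool
	replied bool
}

// Reply runs the hook at most once, and only while the request is valid.
func (r *request) Reply() bool {
	if !r.valid || r.pkt.replyHook == nil {
		return false
	}
	if r.replied {
		// Assumption: a repeated call reports the earlier success.
		return true
	}
	if !r.pkt.replyHook() {
		return false
	}
	r.replied = true
	return true
}

// deliver calls the handler, then invalidates the request and clears the
// hook so neither can be used after delivery.
func deliver(p *packet, handler func(*request) bool) (consumed, replied bool) {
	r := &request{pkt: p, valid: true}
	consumed = handler(r)
	r.valid = false
	p.replyHook = nil
	return consumed, r.replied
}

func main() {
	sent := 0
	p := &packet{data: []byte("echo"), replyHook: func() bool { sent++; return true }}

	var saved *request
	deliver(p, func(r *request) bool {
		saved = r
		r.Reply()
		r.Reply() // Second call does not send again.
		return false
	})

	// The saved request is unusable once the handler has returned.
	fmt.Println(sent, saved.Reply(), p.clone().replyHook == nil)
}
```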

Tests

The added tests cover:

  • IPv4 and IPv6 Reply() delegation.
  • Reply() idempotence when a handler returns false after replying.
  • Forwarder consuming an Echo Request without Reply().
  • Reply() returning false for non-Echo-Request packets.
  • Invalidation of a saved request or saved PacketBuffer after the handler returns.
  • cloned PacketBuffer not inheriting the reply hook.
  • IPv4 temporary-address default suppression and explicit Reply() override.

Test command:

go run github.com/bazelbuild/bazelisk@latest test --nocache_test_results \
  //pkg/tcpip/network/ipv4:ipv4_test \
  //pkg/tcpip/network/ipv6:ipv6_test \
  //pkg/tcpip/network/ipv6:ipv6_x_test \
  //pkg/tcpip/transport/icmp:icmp_x_test \
  //pkg/tcpip/transport/icmp:icmp_nogo \
  //pkg/tcpip/stack:stack_test \
  //pkg/tcpip/stack:stack_x_test \
  //pkg/tcpip/stack:stack_nogo

Related

Add an ICMP forwarder adapter that lets embedders install a per-stack ICMP handler through SetTransportProtocolHandler and explicitly delegate selected Echo Requests back to netstack with ForwarderRequest.Reply().

Keep the existing handler ownership contract unchanged: returning true consumes the packet, returning false leaves fallback behavior available, and Reply is a separate request to synthesize the built-in Echo Reply. The built-in reply construction remains in the IPv4 and IPv6 network endpoint paths so transport/icmp does not duplicate route lookup, source address selection, rate limiting, stats, checksum handling, IPv4 options, IPv6 traffic class handling, or output hooks.

Preserve the conservative default behavior for IPv4 LocalAddressTemporary addresses while allowing a handler to opt in to the built-in reply path for selected requests. Reply uses the original Echo Request destination as the reply source in that explicit path.

The per-packet reply hook is transient, is cleared after transport delivery, is invalidated when the forwarder handler returns, and is not copied by PacketBuffer.Clone().

Signed-off-by: Zi Li <zi.li@linux.dev>

Signed-off-by: Amaindex <amaindex@outlook.com>
Amaindex force-pushed the netstack-icmp-echo-forwarder-reply branch from cf6051b to dc1b84e on June 11, 2026 01:40
@Amaindex

Contributor Author

Thanks for the earlier reviews and discussion on this series. With the default handler behavior from #13189 and #13281 in place, this follow-up is the piece that makes the ICMP Echo path easier to use from embedder code.

I wanted to leave a few concrete usage notes here, because the interesting part of this API is how an embedder chooses between letting netstack answer an Echo Request and taking ownership of the request itself.

cc @nybidari, @ericpauley, @dyhkwong

The main point is that embedders no longer need to reimplement gVisor's built-in ICMP Echo Reply construction just to let netstack answer a request in selected cases. A handler can still own the request when it wants to proxy/drop/forward it, but it can now explicitly delegate the built-in reply back to netstack.

Some typical handler shapes are:

Tunnel/proxy owns the request and suppresses the built-in reply:

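f := icmp.NewForwarder(func(r *icmp.ForwarderRequest) bool {
    // Consume the request only if the proxy accepted it. On error, return
    // false so the existing fallback behavior still applies.
    return proxy.SendICMPEcho(r) == nil
})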

Inside that helper, the proxy should copy the request fields it needs before returning from the handler. For example:

type proxyICMPEchoRequest struct {
    nic         tcpip.NICID
    netProto    tcpip.NetworkProtocolNumber
    source      tcpip.Address
    destination tcpip.Address
    echo        []byte // ICMP Echo header plus payload.
}

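func (p *Proxy) SendICMPEcho(r *icmp.ForwarderRequest) error {
    pkt := r.PacketBuffer()
    id := r.ID()

    // View of the packet starting at the ICMP header. It must be released
    // before this function returns.
    echo := stack.PayloadSince(pkt.TransportHeader())
    defer echo.Release()

    // Copy everything the asynchronous path needs: r and pkt are only valid
    // until the handler returns.
    req := proxyICMPEchoRequest{
        nic:         pkt.NICID,
        netProto:    pkt.NetworkProtocolNumber,
        source:      id.RemoteAddress,
        destination: id.LocalAddress,
        echo:        echo.ToSlice(), // Owned copy; stays valid after Release.
    }

    go p.forwardICMPEcho(req)
    return nil
}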

ForwarderRequest and its PacketBuffer() are only valid during the handler call, so asynchronous proxy code must not save either object. It should copy the fields or packet bytes it needs before returning.
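The copy-before-return rule can be demonstrated without any gVisor types. In this sketch a plain byte slice stands in for handler-scoped packet bytes, and `echoSnapshot` and `snapshot` are hypothetical names introduced only for illustration.

```go
package main

import "fmt"

// echoSnapshot is a hypothetical owned copy of the bytes a proxy needs.
type echoSnapshot struct {
	payload []byte
}

// snapshot copies b so the result stays valid after the caller reuses b.
func snapshot(b []byte) echoSnapshot {
	return echoSnapshot{payload: append([]byte(nil), b...)}
}

func main() {
	// buf stands in for packet bytes that are only valid during the handler.
	buf := []byte("ping-1")

	// aliased shares storage with buf, so it is unsafe to keep.
	aliased := buf
	// owned has its own storage, so it is safe to hand to a goroutine.
	owned := snapshot(buf)

	// Simulate the buffer being reused after the handler returns.
	copy(buf, "XXXXXX")

	fmt.Printf("aliased=%q owned=%q\n", aliased, owned.payload)
	// prints: aliased="XXXXXX" owned="ping-1"
}
```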

If the proxy later wants to report measured upstream/proxy latency, it should write a synthetic Echo Reply itself. I also ran a small TUN smoke test to exercise that path end to end.

Proxy-owned synthetic Echo Reply smoke test

The host-side setup was:

sudo ip tuntap del dev gvicmp0 mode tun 2>/dev/null || true
sudo ip tuntap add dev gvicmp0 mode tun
sudo ip addr add 11.0.0.1/24 dev gvicmp0
sudo ip link set gvicmp0 up

The test program configured the netstack address as 11.0.0.2/24. Its handler returned true without calling Reply(), copied the Echo Request fields and bytes, and a mock proxy path later wrote a synthetic Echo Reply with FindRoute and Route.WritePacket after a 4.242s delay.

Terminal 1:

sudo ./icmp_proxy_tun -tun=gvicmp0 -addr=11.0.0.2 -delay=4242ms

Terminal 2:

ping -n -i 5 11.0.0.2

Observed from the host:

64 bytes from 11.0.0.2: icmp_seq=1 ttl=64 time=4245 ms
64 bytes from 11.0.0.2: icmp_seq=2 ttl=64 time=4246 ms

Harness logs:

proxy captured echo request src=11.0.0.1 dst=11.0.0.2 id=9 seq=1 bytes=64
proxy sent synthetic echo reply dst=11.0.0.1 delay=4.242s
proxy captured echo request src=11.0.0.1 dst=11.0.0.2 id=9 seq=2 bytes=64
proxy sent synthetic echo reply dst=11.0.0.1 delay=4.242s

The route-based synthetic Echo Reply writer in that test looked like this:

func (p *Proxy) writeSyntheticEchoReply(req proxyICMPEchoRequest) tcpip.Error {
    r, err := p.stack.FindRoute(req.nic, req.destination, req.source, req.netProto, false /* multicastLoop */)
    if err != nil {
        return err
    }
    defer r.Release()

    switch req.netProto {
    case ipv4.ProtocolNumber:
        if len(req.echo) < header.ICMPv4MinimumSize {
            return &tcpip.ErrInvalidEndpointState{}
        }

        echo := append([]byte(nil), req.echo...)
        icmp := header.ICMPv4(echo)
        icmp.SetType(header.ICMPv4EchoReply)
        icmp.SetChecksum(0)
        icmp.SetChecksum(^checksum.Checksum(echo, 0))

        pkt := stack.NewPacketBuffer(stack.PacketBufferOptions{
            ReserveHeaderBytes: int(r.MaxHeaderLength()) + header.ICMPv4MinimumSize,
            Payload:            buffer.MakeWithData(echo[header.ICMPv4MinimumSize:]),
        })
        defer pkt.DecRef()
        pkt.TransportProtocolNumber = header.ICMPv4ProtocolNumber
        copy(header.ICMPv4(pkt.TransportHeader().Push(header.ICMPv4MinimumSize)), echo[:header.ICMPv4MinimumSize])

        return r.WritePacket(stack.NetworkHeaderParams{
            Protocol: header.ICMPv4ProtocolNumber,
            TTL:      r.DefaultTTL(),
            TOS:      stack.DefaultTOS,
        }, pkt)

    case ipv6.ProtocolNumber:
        if len(req.echo) < header.ICMPv6EchoMinimumSize {
            return &tcpip.ErrInvalidEndpointState{}
        }

        echo := append([]byte(nil), req.echo...)
        payload := echo[header.ICMPv6EchoMinimumSize:]
        pkt := stack.NewPacketBuffer(stack.PacketBufferOptions{
            ReserveHeaderBytes: int(r.MaxHeaderLength()) + header.ICMPv6EchoMinimumSize,
            Payload:            buffer.MakeWithData(payload),
        })
        defer pkt.DecRef()

        icmp := header.ICMPv6(pkt.TransportHeader().Push(header.ICMPv6EchoMinimumSize))
        pkt.TransportProtocolNumber = header.ICMPv6ProtocolNumber
        copy(icmp, echo[:header.ICMPv6EchoMinimumSize])
        icmp.SetType(header.ICMPv6EchoReply)
        icmp.SetChecksum(0)
        icmp.SetChecksum(header.ICMPv6Checksum(header.ICMPv6ChecksumParams{
            Header:      icmp,
            Src:         r.LocalAddress(),
            Dst:         r.RemoteAddress(),
            PayloadCsum: pkt.Data().Checksum(),
            PayloadLen:  pkt.Data().Size(),
        }))

        return r.WritePacket(stack.NetworkHeaderParams{
            Protocol: header.ICMPv6ProtocolNumber,
            TTL:      r.DefaultTTL(),
            TOS:      stack.DefaultTOS,
        }, pkt)
    }

    return &tcpip.ErrUnknownProtocol{}
}

This manual path is intentionally different from r.Reply(): Reply() asks netstack to send the built-in reply immediately, while a proxy-owned synthetic reply can be delayed until the proxy has measured or selected the real upstream path. Production code may also want to preserve request TOS/TrafficClass, handle IPv4 options, account for its own rate limiting or stats, or use a header-included/raw path when the desired reply source is not a normal route-selectable local address.
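For readers who want to see the IPv4 type-and-checksum rewrite from the writer above in isolation, here is a dependency-free sketch using only the standard library. The helper names `onesComplementSum` and `echoReplyFrom` are hypothetical; the gVisor writer uses `header.ICMPv4` and `checksum.Checksum` instead. The checksum follows the Internet checksum of RFC 1071, computed over the whole ICMP message with the checksum field zeroed.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	icmpv4Echo      = 8 // ICMPv4 Echo Request type
	icmpv4EchoReply = 0 // ICMPv4 Echo Reply type
	icmpv4HeaderLen = 8 // type, code, checksum, identifier, sequence
)

// onesComplementSum returns the 16-bit ones' complement sum of b (not
// inverted). An odd trailing byte is padded with a zero byte.
func onesComplementSum(b []byte) uint16 {
	var sum uint32
	for len(b) >= 2 {
		sum += uint32(binary.BigEndian.Uint16(b))
		b = b[2:]
	}
	if len(b) == 1 {
		sum += uint32(b[0]) << 8
	}
	for sum>>16 != 0 {
		sum = (sum & 0xffff) + (sum >> 16)
	}
	return uint16(sum)
}

// echoReplyFrom returns a copy of an ICMPv4 Echo Request rewritten as an
// Echo Reply with a fresh checksum. It does not validate the request's
// own checksum.
func echoReplyFrom(req []byte) ([]byte, bool) {
	if len(req) < icmpv4HeaderLen || req[0] != icmpv4Echo {
		return nil, false
	}
	reply := append([]byte(nil), req...)
	reply[0] = icmpv4EchoReply
	reply[2], reply[3] = 0, 0 // Zero the checksum field before summing.
	binary.BigEndian.PutUint16(reply[2:4], ^onesComplementSum(reply))
	return reply, true
}

func main() {
	// Echo Request: id=9, seq=1, payload "ping"; checksum filled in below.
	req := []byte{icmpv4Echo, 0, 0, 0, 0, 9, 0, 1, 'p', 'i', 'n', 'g'}
	binary.BigEndian.PutUint16(req[2:4], ^onesComplementSum(req))

	reply, ok := echoReplyFrom(req)
	// A valid message sums to 0xffff when the checksum field is included.
	fmt.Println(ok, reply[0], onesComplementSum(reply) == 0xffff)
	// prints: true 0 true
}
```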

Large-address-space / temporary-address embedder delegates selected requests back to netstack:

f := icmp.NewForwarder(func(r *icmp.ForwarderRequest) bool {
    if r.PacketInfo().LocalAddressTemporary {
        return r.Reply()
    }

    return false
})

Observer-style handler records metadata but leaves fallback behavior unchanged:

f := icmp.NewForwarder(func(r *icmp.ForwarderRequest) bool {
    observe(r.ID(), r.PacketInfo())
    return false
})

Dual-stack embedders can install the same handler shape for IPv4 and IPv6:

f := icmp.NewForwarder(handleICMPEcho)

s.SetTransportProtocolHandler(icmp.ProtocolNumber4, f.HandlePacket)
s.SetTransportProtocolHandler(icmp.ProtocolNumber6, f.HandlePacket)

For the large-address-space / temporary-address case, Reply() lets the handler preserve the original Echo Request destination as the reply source without copying route lookup, source address selection, rate limiting, stats, checksum handling, IPv4 options, IPv6 traffic class handling, or output hooks into downstream code.

For tunnel/proxy-style handlers, the ownership rule remains explicit: returning true without calling Reply() consumes the request and prevents the built-in reply. Returning false without calling Reply() leaves the existing fallback behavior unchanged.

I kept this as bool + Reply() rather than an action enum so the existing SetTransportProtocolHandler return value continues to mean only whether the handler consumed the packet. Reply() is the separate explicit delegation action.

@ericpauley

Contributor

Thanks for this excellent pull request and synopsis (and particularly for taking our use case feedback into account). I believe the "Large-address-space / temporary-address embedder" scenario will fully cover our use case.

nybidari self-requested a review on June 12, 2026 19:25